The Horus project


This quarterly status report covers activities of the Horus project during the fourth quarter of
1993.  This  is  also  our  second  semi-annual  progress  report  under  the  present  ARPA  grant.  Because 
these  status  reports  are  intended  to  be  brief  and  our  proposal  was  recently  funded,  we  assume  that 
the  reader  has  some  background  regarding  the  goals  and  status  of  our  effort,  and  focus  instead  on 
technical  accomplishments  during  the  report  period  and  goals  for  the  next  three  months.  Readers 
unfamiliar  with  our  work  could  start  by  reading  some  of  the  papers  cited  below.  The  Isis  overview 
that  appeared  in  Communications  of  the  ACM  in  December  1993  gives  a  good  general  picture  of 
our  past  work. 

This  report  uses  the  new  ONR  reporting  style.  It  discusses  recent  progress,  transitions,  and  recent 
publications. 


Progress 

With  Marzullo  now  solidly  installed  at  U.C.  San  Diego,  our  work  has  two  main  tracks.  The  larger 
of  these  is  the  continuing  Horus  development  effort  at  Cornell  University,  which  has  now  resulted 
in a working system that others have begun to experiment with. Near-term goals with Horus are
focused on developing an appropriate API for users and embedding the system in settings with
minimal need for general-purpose operating system features.

The  San  Diego  effort  focuses  on  a  real-time  technology  integrated  with  Horus,  called  the  Corto 
subsystem. 

In  addition  to  these  activities,  we  are  involved  in  two  significant  collaborations,  one  with  the  Transis 
project  (located  at  Hebrew  University  in  Jerusalem),  and  one  with  the  Delta-4  group  (located  at 
INESC  in  Portugal). 

The  major  accomplishments  of  this  report  period  are  as  follows: 

•  We  have  continued  to  extend  and  build  upon  the  first  version  of  the  Horus  system,  following 
a  path  similar  to  the  one  used  in  developing  the  older  ISIS  toolkit.  Predictions  of  a  10-  to 
100-fold  performance  improvement  appear  to  be  justified.  With  the  recent  delivery  of  a  new 
release of the x-kernel under native Mach, we are resuming work on a port of our software that
would run under Mach.

•  Working  with  Hebrew  University’s  Transis  group,  we  developed  a  new  approach  to  tolerating 
partitions  and  have  integrated  the  necessary  mechanisms  into  Horus.  With  this  code  in  place, 
it  is  possible  to  develop  applications  that  continue  operating  even  during  network  partition 
failures  and  that  automatically  “heal  themselves”  upon  re-establishing  communication. 

•  Also  working  with  Hebrew  University’s  Transis  group,  we  developed  a  way  to  present  systems 
like  Horus  and  Transis  through  the  UNIX  socket  interface,  or  the  Mach  ports  interface.  Our 
approach  has  the  benefit  of  not  requiring  changes  to  UNIX  or  Mach,  and  in  fact  we  plan  to 
run  Horus  on  the  Mach  microkernel  (with  none  of  the  remainder  of  the  operating  system)  as 
an  experiment  during  1994. 

•  We  developed  new  and  much  improved  flow  control  algorithms  for  Isis  and  for  Horus,  which 
also  permit  expanded  use  of  hardware  multicast  when  that  feature  is  available.  Performance 
exceeds that reported for any other general-purpose UNIX-based multicast algorithm, although
it trails the performance obtained in the Amoeba project using “raw” hardware (a
special-purpose device driver) and in the Transis project, which does not support any sort of
toolkit or application development environment.

• We developed a new “cellular” approach to presenting Horus and Isis, as well as other systems.
Briefly,  the  idea  starts  with  the  recognition  that  these  technologies  can  only  enhance  reliability, 
not  protect  completely  against  catastrophic  failure.  Indeed,  they  can  even  introduce  new 
types  of  distributed  failures  that  originate  in  bugs  or  deadlocks  in  our  own  code,  although, 
obviously,  we  do  everything  we  can  to  minimize  this!  A  cellular  presentation  of  a  system 
divides  the  system  into  multiple  cells,  using  very  high  performance  gateways  for  inter-cell 
communication.  Our  cellular  approach  is  for  process-group  systems,  and  our  gateways  can 
be used between Horus cells or Isis cells, but can also be used to connect Horus to Isis, and
indeed to connect either of these systems to Transis or Delta-4. We are currently writing a
paper on this approach. Cells are also useful in multi-level security domains. An example
of how the architecture will be used arises in the French Air Traffic Control System. In this
system,  each  position  (3  consoles  operated  by  a  team  of  flight  controllers)  would  be  a  single 
cell;  the  system-wide  management  and  monitoring  system  would  be  another  cell.  Such  an 
architecture  scales  very  well,  and  it  weakens  the  dependency  between  software  on  different 
cells.  With  care,  it  should  be  possible  to  design  this  system  to  tolerate  even  a  completely 
arbitrary  failure  in  one  cell  -  other  cells  would  just  keep  running  normally,  perhaps  spooling 
some  data  for  transmission  to  the  failed  cell  when  it  recovers.  We  view  this  as  a  big  conceptual 
advance  for  the  project  and  one  that  is  likely  to  have  significant  impact  on  the  feasibility  of 
using Horus and Isis in very large settings or in very critical ones.

• We are making good progress towards the design and implementation of Corto, the real-time
layer. The lowest layers of this system consist of three major components: a clock
synchronization service, a real-time multicast transport service, and a network scheduling
layer that synchronizes with the multicast transport layer and the POSIX-compliant threads
scheduling facility. This software is running on top of Lynx, a commercial real-time kernel,
but we are designing the system to also run on top of any of the real-time Mach systems as
they become available (we expect to migrate to RT-Mach by the beginning of the summer).

Early in our design process, it became clear that a TDMA-based approach (such as that used by
Mars, an important and highly influential system recently built by Hermann Kopetz at the
University of Vienna) would provide the best performance and schedulability as compared to
either sequencing site protocols (such as Horus or Amoeba) or train-based protocols (such
as Transis and Total). It also was clear that the optimization provided by CBCAST in the
asynchronous Horus suite of protocols is not applicable in a real-time domain. In addition,
we found that an end-to-end argument could be used to double the bandwidth (at a cost of
increased maximum latency) of a TDMA-based approach. Our resulting real-time multicast
transport protocol is fast, dependable in the face of processor crashes and various omission
failures, and highly portable. We are now working on how to make the assignment of
TDMA  slots  dynamic  in  order  to  guarantee  hard  real-time  delivery  requirements  in  the  face 
of  changing  membership  and  changing  mode.  We  are  using  a  simple  approach  that  will  not 
be  as  efficient  as  that  provided  by  Mars  but  will  be  much  more  adaptable.  Our  hope  is  to 
eventually  adapt  the  on-line  scheduling  techniques  developed  by  Christos  Papadimitriou  of 
U.C.  San  Diego  to  slot  scheduling. 

We have had the prototype protocols running for approximately three months now, and
are mainly attacking problems with operating system, hardware, and scheduling constraints.
For example, we wish to have Corto be as portable as possible across commercial POSIX-compliant
Unix platforms. Hence, we have built the real-time transport service on top of
UDP, using a single Ethernet and the standard Unix sleep library routine. This approach
places a lower bound of 10 msec on the TDMA slot, unless one staggers the sleep
interrupts. We have done so, thereby reducing the slot to 2.5 msec for a 4-processor system,
but the approach obviously doesn’t scale well. In any case, the prototype maintains
clock synchronization to within approximately 300 microseconds, with resynchronization occurring
approximately twice a minute, and message stability occurs in two TDMA slots.

•  We  completed  several  papers  (see  below  under  “publications”)  and  a  book. 

• We continued research ties with other laboratories, including the Los Alamos Advanced
Computing Laboratory (which focuses on supercomputing), the Israeli Transis project, and
Portugal’s INESC research laboratory (known for its work on real-time communication), and
with Mach-related research efforts at the Open Software Foundation, Carnegie Mellon, and the
University of Arizona.

• We continued the implementation of the security architecture for Horus by extending the
existing code to include a secure name service.

• We have continued our new effort to explore specialized implementations of Horus for parallel
processors and for ATM networks. This work is motivated by the impressive results of
Berkeley’s Split-C and Active Messages research, demonstrating that asynchronous communication
can lead to tremendous performance gains on the most important emerging parallel
processors. In a very exciting development, the main developer of the Active Messages system
(Thorsten von Eicken) joined our project during the fall, as a faculty member in the Cornell
Dept. of Computer Science. Brian Smith, who has worked on multimedia file servers, will
join us shortly. This has created the critical mass for a push that will move Horus onto highly
parallel platforms, and onto advanced high-performance communication environments. We
want to build our protocols in ways that exploit the hardware fully and minimize unnecessary
work in software - work needed on networks but not on closely coupled machines. We are
very pleased with this new direction.
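
The slot arithmetic mentioned in the Corto item above can be illustrated with a short sketch. This is not the Corto code: the function names are hypothetical, and the sketch assumes only what the report states, namely a 10 msec sleep granularity, evenly staggered sleep interrupts across processors, and message stability after two TDMA slots.

```python
# Illustrative sketch only; names are hypothetical and not taken from Corto.
# Assumes each processor's sleep timer fires once per timer period and that
# the timers are offset evenly so that wakeups interleave across processors.

def staggered_slot_ms(timer_granularity_ms: float, processors: int) -> float:
    """Effective TDMA slot length when sleep interrupts are evenly staggered."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return timer_granularity_ms / processors


def wakeup_offsets_ms(timer_granularity_ms: float, processors: int) -> list:
    """Offset of each processor's wakeup within one timer period."""
    slot = staggered_slot_ms(timer_granularity_ms, processors)
    return [i * slot for i in range(processors)]


def stability_latency_ms(slot_ms: float, slots_to_stability: int = 2) -> float:
    """Time until a message is stable, given stability after N slots."""
    return slot_ms * slots_to_stability


if __name__ == "__main__":
    slot = staggered_slot_ms(10.0, 4)        # figures quoted in the report
    print("slot (msec):", slot)
    print("wakeup offsets (msec):", wakeup_offsets_ms(10.0, 4))
    print("stability latency (msec):", stability_latency_ms(slot))
```

With the report's figures (10 msec granularity, 4 processors) the sketch reproduces the 2.5 msec slot. The derived stability latency is simply two slot lengths under the stated assumption, not a measured result. Note that the effective slot here depends on the number of processors, which may be one reason the staggering trick does not scale well.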
Transitions


Our  project  is  perhaps  unique  among  distributed  systems  efforts  in  the  United  States  in  the  degree 
of  success  we  have  had  with  technology  transfer. 

Technology  transfer 

During the fourth quarter of 1993, Stratus Computer Inc. of Boston, a company specializing in
availability technologies, acquired Isis Distributed Systems, our spin-off company that has focused
on commercializing Isis. The acquisition came at a time of great success for Isis, which has been
selected for use in settings like the New York Stock Exchange, the French Air Traffic Control System,
the Swiss Electronic Bourse (ATB/EBS), Iridium, Sematech’s factory floor system, and a great
number of other high-visibility, demanding applications. These include a number of U.S. government
and military applications, of which the Hiper-D project (follow-on to AEGIS) is the most visible, and
extend to at least a dozen similar efforts in every agency of the military and government.

Stratus is firmly committed to availability, through software and hardware, and views Isis as the
center of its future strategy in building continuously available computing systems and highly
available distributed software. The company has stated emphatically that it will continue to port Isis
to a wide variety of vendor platforms, including both UNIX systems and non-UNIX environments.
The company will also be exploring ways to exploit emerging hardware such as ATM technology
that has the potential to dovetail with Horus, and has obtained rights to commercialize Horus
through Cornell University. The commitment to a heterogeneous presentation of Isis and Horus
has been repeatedly stressed by Stratus, which intends to port Isis and Horus to a wider range
of platforms (notably, PCs) while maintaining the technology on the current platforms (mostly
UNIX workstations and VMS).

Through  this  development,  Isis  and  Horus  are  clearly  entering  the  mainstream  and  will  have  greatly 
enhanced  impact  on  the  economic  mission  of  the  country,  just  at  a  time  when  the  demand  for 
reliability  in  “data  highway”  applications  is  becoming  acute.  We  believe  that  this  is  a  great  success 
story  for  ARPA  and  ONR:  a  proof  that  the  decade-long  investment  by  ARPA  in  this  technology 
area  has  created  the  basis  for  a  new  industry  that  has  become  self-sustaining  and  accepted. 

The  Stratus  acquisition  will  cause  some  reorganization  within  the  Horus  effort  at  Cornell,  but 
nothing  drastic  is  expected  to  change.  Birman  will  consult  for  Stratus  in  the  role  of  Chief  Scientist 
for  the  Isis  effort,  but  this  is  recognized  as  a  part-time  job,  and  is  not  expected  to  create  more  load 
than  Birman’s  work  as  President  of  Isis  over  the  past  few  years.  Birman  will  continue  to  head  the 
Horus  project,  where  the  focus  now  is  on  arriving  at  a  mature  system  that  Stratus  can  pick  up, 
while  also  striking  in  a  new  direction  that  would  explore  ATM  networking  and  parallel  computing. 

Cooper will be leaving Cornell to head the technology development group within Isis at Stratus,
but will remain in Ithaca and will continue to work closely with us. We currently plan to fill his
position with a post-doctoral student. Van Renesse will remain at Cornell, but will consult for
Stratus to assist in technology transition for Horus.

Stratus is committed to maintaining access to Isis and Horus for research users in academic settings
and  other  settings.  The  structure  appears  to  be  an  ideal  route  for  transition  of  current  and  future 
work  by  our  research  effort. 
Hiper-D effort


Isis continues to work closely with the Hiper-D program, which is now being moved to the Navy
and will become an advanced prototyping effort under the overall AEGIS R&D effort. This work
seems to be moving forward rapidly, and has adopted a reasonable compromise between needing
to use robust existing technology (Isis, Mach) and wanting to exploit emerging platforms like the
Paragon. Isis Distributed Systems will maintain a significant effort in this area, and will continue
to provide any necessary support to the Hiper-D developers at JH/APL and elsewhere.

Collaborations 

As noted above, Horus has excited wide interest in the research and advanced development
community. We maintain close ties to dozens of other efforts, and are sharing technology with several
national laboratories, supercomputing projects at Los Alamos Laboratories, Sandia, and NASA
JPL, and are exploring ties with a number of commercial prototyping efforts.
FY95 $956,187


Publications 


Below, we reproduce a list of recent publications by the effort. A good general review of the project
is the article that appeared in the December issue of Communications of the ACM. We have also
just completed a book that will be published by IEEE Press, and collects the most important
Isis papers together with about 50% previously unpublished material, as a single volume.
The book will appear early in 1994.


PUBLICATIONS  LIST 


ISIS  Activity 

•  Keith  Marzullo  and  Ida  Szafranska.  Monitoring  and  Controlling  Distributed  Applications 
using  Lomita.  IEEE  First  International  Workshop  on  Systems  Management,  14-16  April 
1993. 

• Ken Birman. A Response to Cheriton and Skeen's Criticism of Causal and Totally Ordered
Communication. Published in Operating Systems Review, January, 1994.

• Lorenzo Alvisi, Bruce Hoppe and Keith Marzullo. Nonblocking and Orphan-Free Message
Logging  Protocols.  Accepted  for  presentation  at  FTCS. 

•  Ozalp  Babaoglu,  Keith  Marzullo  and  Fred  B.  Schneider.  Priority  Inversion  and  its  Prevention. 
Accepted  for  publication  in  Journal  of  Real-Time  Systems,  volume  5,  number  4. 

• Kenneth P. Birman. The Process Group Approach to Reliable Distributed Computing.
Communications of the ACM, 36:12 (Dec. 1993).

• André Schiper, Aleta Ricciardi and Kenneth Birman. Virtually-Synchronous Communication
Based on a Weak Failure Suspector. June 22-24, 1993, Toulouse, France.

•  Robbert  van  Renesse.  Why  Bother  with  CATOCS?  Published  in  Operating  Systems  Review, 
January,  1994,  Vol.  28,  No  1. 

•  Robert  Cooper.  Experience  with  Causally  and  Totally  Ordered  Communication  Support. 
Published  in  Operating  Systems  Review,  January,  1994,  Vol  28,  No  1. 

• Robbert van Renesse and Dag Johansen. Distributed Systems in Perspective. Published in
Distributed  Open  Systems. 

•  Kenneth  Birman  and  Brad  Glade.  Consistent  Failure  Reporting  in  Reliable  Communication 
Systems.  Submitted  October,  1993. 

• Kenneth Birman. A book to be published by IEEE Press in early 1994. This book collects
the most important Isis papers along with previously unpublished material.

• Robbert van Renesse and Kenneth Birman. Fault-Tolerant Programming Using Process
Groups. Published in Distributed Open Systems.

•  Robbert  van  Renesse  and  Dag  Johansen.  Software  Structures  for  Supporting  Distributed 
Computing.  Published  in  Distributed  Open  Systems. 
• Keith Marzullo and M.D. Wood. Tools for Monitoring and Controlling Distributed
Applications. Published in Distributed Open Systems.

• Bradford B. Glade, Kenneth P. Birman, Robert Cooper, and Robbert van Renesse.
Light-Weight Process Groups in the ISIS System. Published in Distributed Systems Engineering
Journal, July 1993.